Abstract: As the number of internet users grows, the number of accessible web pages grows with it, which makes it harder for users to find relevant or specific data that meets their needs. A web crawler is the mechanism search engines use to collect pages from the web. Building a crawler that downloads only the most relevant content from so large a web remains a serious challenge in the field of information retrieval. Most web crawlers use a keyword-based approach to retrieve information from the web, but they also retrieve many irrelevant pages. Using semantic information allows more relevant pages to be downloaded, and this semantic information can be provided by an ontology. This paper proposes an algorithm for an ontology-based web crawler that is designed to retrieve only relevant sites and to estimate the best path for crawling, thereby improving crawling performance.

Keywords: web crawler, focused web crawler, importance metrics, ontology, domain knowledge.